Discriminative Strategies to Integrate Multiword Expression Recognition and Parsing

نویسندگان

  • Matthieu Constant
  • Anthony Sigogne
  • Patrick Watrin
چکیده

The integration of multiword expressions in a parsing procedure has been shown to improve accuracy in an artificial context where such expressions have been perfectly pre-identified. This paper evaluates two empirical strategies to integrate multiword units in a real constituency parsing context and shows that the results are not as promising as has sometimes been suggested. Firstly, we show that pregrouping multiword expressions before parsing with a state-of-the-art recognizer improves multiword recognition accuracy and unlabeled attachment score. However, it has no statistically significant impact in terms of F-score as incorrect multiword expression recognition has important side effects on parsing. Secondly, integrating multiword expressions in the parser grammar followed by a reranker specific to such expressions slightly improves all evaluation metrics.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Strategies for Contiguous Multiword Expression Analysis and Dependency Parsing

In this paper, we investigate various strategies to predict both syntactic dependency parsing and contiguous multiword expression (MWE) recognition, testing them on the dependency version of French Treebank (Abeillé and Barrier, 2004), as instantiated in the SPMRL Shared Task (Seddah et al., 2013). Our work focuses on using an alternative representation of syntactically regular MWEs, which capt...

متن کامل

Accounting for Contiguous Multiword Expressions in Shallow Parsing

In this paper, we focus on chunking including contiguous multiword expression recognition, namely super-chunking. In particular, we present different strategies to improve a superchunker based on Conditional Random Fields by combining it with a finite-state symbolic super-chunker driven by lexical and grammatical resources. We display a substantial gain of 7.6 points in terms of overall accuracy.

متن کامل

Stratégies discriminantes pour intégrer la reconnaissance des mots composés dans un analyseur syntaxique en constituants

We propose two discriminative strategies to integrate compound word recognition in a parsing context: (i) compound pregrouping with Conditional Random Fields before parsing, (ii) reranking parses with a maximum entropy model after parsing. These discriminative models integrate features dedicated to compounds, some of theme being computed from external lexical resources. We show that the pregrou...

متن کامل

Parsing Models for Identifying Multiword Expressions

Multiword expressions lie at the syntax/semantics interface and have motivated alternative theories of syntax like Construction Grammar. Until now, however, syntactic analysis and multiword expression identification have been modeled separately in natural language processing. We develop two structured prediction models for joint parsing and multiword expression identification. The first is base...

متن کامل

Deep Lexical Segmentation and Syntactic Parsing in the Easy-First Dependency Framework

We explore the consequences of representing token segmentations as hierarchical structures (trees) for the task of Multiword Expression (MWE) recognition, in isolation or in combination with dependency parsing. We propose a novel representation of token segmentation as trees on tokens, resembling dependency trees. Given this new representation, we present and evaluate two different architecture...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012